Hackensack
Sampling from Constrained Gibbs Measures: with Applications to High-Dimensional Bayesian Inference
Wang, Ruixiao, Chen, Xiaohong, Chewi, Sinho
This paper considers a non-standard problem of generating samples from a low-temperature Gibbs distribution with \emph{constrained} support, when some of the coordinates of the mode lie on the boundary. These coordinates are referred to as the non-regular part of the model. We show that in a ``pre-asymptotic'' regime in which the limiting Laplace approximation is not yet valid, the low-temperature Gibbs distribution concentrates on a neighborhood of its mode. Within this region, the distribution is a bounded perturbation of a product measure: a strongly log-concave distribution in the regular part and a one-dimensional exponential-type distribution in each coordinate of the non-regular part. Leveraging this structure, we provide a non-asymptotic sampling guarantee by analyzing the spectral gap of Langevin dynamics. Key examples of low-temperature Gibbs distributions include Bayesian posteriors, and we demonstrate our results on three canonical examples: a high-dimensional logistic regression model, a Poisson linear model, and a Gaussian mixture model.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
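As a rough illustration of the non-regular part described in the abstract above, the following sketch runs a discretized Langevin chain on a one-dimensional exponential-type target, with density proportional to exp(-beta * x) on x >= 0. The reflection step, the function name `reflected_langevin`, and all parameter values are assumptions made for illustration; the abstract does not specify a discretization, and this is not presented as the authors' algorithm.

```python
import math
import random


def reflected_langevin(beta, step, n_steps, seed=0):
    """Reflected Euler discretization of Langevin dynamics on [0, inf).

    Target density is proportional to exp(-beta * x) for x >= 0, so the
    potential gradient is the constant beta. Reflection at 0 (taking the
    absolute value) is an illustrative choice, not the paper's scheme.
    """
    rng = random.Random(seed)
    x = 1.0 / beta  # start near the bulk of the target
    noise_scale = math.sqrt(2.0 * step)
    samples = []
    for _ in range(n_steps):
        proposal = x - step * beta + noise_scale * rng.gauss(0.0, 1.0)
        x = abs(proposal)  # reflect back into the constraint set
        samples.append(x)
    return samples


samples = reflected_langevin(beta=10.0, step=1e-3, n_steps=20000)
sample_mean = sum(samples) / len(samples)  # should be roughly 1 / beta
```

A larger `beta` corresponds to a lower temperature: the mass piles up against the boundary at 0, which is the situation in which a Gaussian (Laplace) approximation around the mode is a poor fit.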
Differentially Private Optimization with Sparse Gradients
Motivated by applications of large embedding models, we study differentially private (DP) optimization problems under sparsity of individual gradients. We start with new near-optimal bounds for the classic mean estimation problem but with sparse data, improving upon existing algorithms particularly for the high-dimensional regime.
- South America > Chile (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- North America > United States > New York (0.04)
- North America > United States > California (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- (7 more...)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- (3 more...)
Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians
Training recurrent neural networks (RNNs) remains a challenge due to the instability of gradients across long time horizons, which can lead to exploding and vanishing gradients. Recent research has linked these problems to the values of the Lyapunov exponents of the forward dynamics, which describe the growth or shrinkage of infinitesimal perturbations. Here, we propose gradient flossing, a novel approach to tackling gradient instability by pushing Lyapunov exponents of the forward dynamics toward zero during learning.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- (5 more...)
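To make the notion of a Lyapunov exponent of the forward dynamics concrete, here is a minimal sketch for a scalar recurrence h_{t+1} = tanh(weight * h_t). The scalar model, the function name `largest_lyapunov_exponent`, and the parameter values are assumptions chosen for illustration; the sketch only estimates the exponent and does not implement gradient flossing itself.

```python
import math


def largest_lyapunov_exponent(weight, h0=0.3, burn_in=100, n_steps=1000):
    """Estimate the Lyapunov exponent of h_{t+1} = tanh(weight * h_t).

    The exponent is the time average of log|d h_{t+1} / d h_t|, where the
    one-step Jacobian is weight * (1 - tanh(weight * h_t) ** 2).
    """
    h = h0
    for _ in range(burn_in):  # discard the transient
        h = math.tanh(weight * h)
    log_growth = 0.0
    for _ in range(n_steps):
        pre = weight * h
        jacobian = weight * (1.0 - math.tanh(pre) ** 2)
        log_growth += math.log(abs(jacobian))
        h = math.tanh(pre)
    return log_growth / n_steps


contracting = largest_lyapunov_exponent(0.5)
```

With `weight = 0.5` the state decays to zero and the estimate approaches log(0.5), a negative exponent. Perturbations, and with them backpropagated gradients, shrink geometrically over time, which is the vanishing-gradient regime the abstract refers to.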
What Functions Does XGBoost Learn?
Ki, Dohyeong, Guntuboyina, Adityanand
This paper establishes a rigorous theoretical foundation for the function class implicitly learned by XGBoost, bridging the gap between its empirical success and our theoretical understanding. We introduce an infinite-dimensional function class $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ that extends finite ensembles of bounded-depth regression trees, together with a complexity measure $V^{d, s}_{\infty-\text{XGB}}(\cdot)$ that generalizes the $L^1$ regularization penalty used in XGBoost. We show that every optimizer of the XGBoost objective is also an optimizer of an equivalent penalized regression problem over $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ with penalty $V^{d, s}_{\infty-\text{XGB}}(\cdot)$, providing an interpretation of XGBoost as implicitly targeting a broader function class. We also develop a smoothness-based interpretation of $\mathcal{F}^{d, s}_{\infty-\text{ST}}$ and $V^{d, s}_{\infty-\text{XGB}}(\cdot)$ in terms of Hardy--Krause variation. We prove that the least squares estimator over $\{f \in \mathcal{F}^{d, s}_{\infty-\text{ST}}: V^{d, s}_{\infty-\text{XGB}}(f) \le V\}$ achieves a nearly minimax-optimal rate of convergence $n^{-2/3} (\log n)^{4(\min(s, d) - 1)/3}$, thereby avoiding the curse of dimensionality. Our results provide the first rigorous characterization of the function space underlying XGBoost, clarify its connection to classical notions of variation, and identify an important open problem: whether the XGBoost algorithm itself achieves minimax optimality over this class.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- (4 more...)
On detection probabilities of link invariants
Kelomäki, Tuomas, Lacabanne, Abel, Tubbenhauer, Daniel, Vaz, Pedro, Zhang, Victor L.
We prove that the detection rate of n-crossing alternating links by many standard link invariants decays exponentially in n, implying that they detect alternating links with probability zero. This phenomenon applies broadly, in particular to the Jones and HOMFLYPT polynomials and integral Khovanov homology. We also use a big-data approach to analyze knots and provide evidence that, for knots as well, these invariants exhibit the same asymptotic failure of detection.
- Oceania > Australia > New South Wales (0.04)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- (6 more...)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- Europe > Netherlands (0.04)
- Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
- Europe > France (0.04)
Infinite-Dimensional Operator/Block Kaczmarz Algorithms: Regret Bounds and $λ$-Effectiveness
Jeong, Halyun, Jorgensen, Palle E. T., Kwon, Hyun-Kyoung, Song, Myung-Sin
We present a variety of projection-based linear regression algorithms with a focus on modern machine-learning models and their algorithmic performance. We study the role of the relaxation parameter in generalized Kaczmarz algorithms and establish a priori regret bounds with explicit $λ$-dependence to quantify how much an algorithm's performance deviates from its optimal performance. A detailed analysis of the relaxation parameter is also provided. Applications include explicit regret bounds for the framework of Kaczmarz algorithm models, non-orthogonal Fourier expansions, and the use of regret estimates in modern machine-learning models, including for noisy data, i.e., regret bounds for noisy Kaczmarz algorithms. Motivated by machine-learning practice, our wider framework treats bounded operators (on infinite-dimensional Hilbert spaces), with updates realized as (block) Kaczmarz algorithms, leading to new and versatile results.
- North America > United States > New York > Albany County > Albany (0.14)
- North America > United States > Iowa > Johnson County > Iowa City (0.14)
- North America > United States > New York > Montgomery County > Amsterdam (0.04)
- (7 more...)
- Research Report (0.50)
- Workflow (0.46)
- Instructional Material (0.45)
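For readers unfamiliar with the relaxation parameter mentioned in the Kaczmarz entry above, the following is a minimal finite-dimensional sketch of the classical relaxed Kaczmarz iteration, x <- x + λ (b_i - <a_i, x>) / ||a_i||^2 * a_i, cycling over the rows of a consistent system. The function name `kaczmarz`, the 2-by-2 example system, and the sweep count are assumptions for illustration; the paper's operator and block variants on infinite-dimensional Hilbert spaces are not reproduced here.

```python
def kaczmarz(A, b, relaxation=1.0, sweeps=200):
    """Cyclic relaxed Kaczmarz iteration for a consistent system A x = b.

    `relaxation` plays the role of the parameter lambda; the classical
    method uses lambda = 1 (exact projection onto each row's hyperplane).
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, target in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # a zero row carries no constraint
            residual = target - sum(a * xi for a, xi in zip(row, x))
            step = relaxation * residual / norm_sq
            x = [xi + step * a for xi, a in zip(x, row)]
    return x


# Hypothetical example system with exact solution x = (1, 3).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [5.0, 10.0]
solution = kaczmarz(A, b, relaxation=1.0)
under_relaxed = kaczmarz(A, b, relaxation=0.5)
```

Both calls approach the same solution on this small system; the relaxation value changes how quickly the iterates get there, which is the kind of dependence the abstract's regret bounds are meant to quantify.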